RDG for DPF with OVN-Kubernetes and HBN Services Demo

Kubespray Deployment and Configuration

In this solution, the Kubernetes (K8s) cluster is deployed from the Jump Node using a modified Kubespray (based on tag v2.26.0) under the non-root depuser account. The modifications to Kubespray are designed to meet the DPF prerequisites described in the User Manual and to facilitate cluster deployment and scaling.

  1. Download the modified Kubespray archive: modified_kubespray_v2.26.0.tar.gz.

  2. Extract the contents and navigate to the extracted directory:

    Jump Node Console

    $ tar -xzf /home/depuser/modified_kubespray_v2.26.0.tar.gz
    $ cd kubespray/
    depuser@jump:~/kubespray$

  3. Set the K8s API VIP address and DNS record. Replace these with your own IP address and DNS record if they differ:

    Jump Node Console

    depuser@jump:~/kubespray$ sed -i '/ #kube_vip_address:/s/.*/kube_vip_address: 10.0.110.10/' inventory/mycluster/group_vars/k8s_cluster/addons.yml
    depuser@jump:~/kubespray$ sed -i '/apiserver_loadbalancer_domain_name:/s/.*/apiserver_loadbalancer_domain_name: "kube-vip.dpf.rdg.local.domain"/' roles/kubespray-defaults/defaults/main/main.yml
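
    As a quick sanity check (an optional step, not part of the original flow), you can grep for the new values in the files edited above; your output may include additional matching lines, but the two edited lines should appear as shown:

    Jump Node Console

    depuser@jump:~/kubespray$ grep 'kube_vip_address' inventory/mycluster/group_vars/k8s_cluster/addons.yml
    kube_vip_address: 10.0.110.10
    depuser@jump:~/kubespray$ grep 'apiserver_loadbalancer_domain_name' roles/kubespray-defaults/defaults/main/main.yml
    apiserver_loadbalancer_domain_name: "kube-vip.dpf.rdg.local.domain"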

  4. Install the necessary dependencies and set up the Python virtual environment:

    Jump Node Console

    depuser@jump:~/kubespray$ sudo apt -y install python3-pip jq python3.12-venv
    depuser@jump:~/kubespray$ python3 -m venv .venv
    depuser@jump:~/kubespray$ source .venv/bin/activate
    (.venv) depuser@jump:~/kubespray$ python3 -m pip install --upgrade pip
    (.venv) depuser@jump:~/kubespray$ pip install -U -r requirements.txt
    (.venv) depuser@jump:~/kubespray$ pip install ruamel-yaml
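
    To verify the environment is ready (a hedged check; the exact version printed depends on the pin in Kubespray's requirements.txt), confirm that Ansible resolves from the activated virtual environment:

    Jump Node Console

    (.venv) depuser@jump:~/kubespray$ ansible --version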

  5. Review and edit the inventory/mycluster/hosts.yaml file to define the cluster nodes. The following is the configuration for this deployment:

    Note
    • All of the nodes are already labeled and annotated as required by the DPF user manual prerequisites.

    • The worker nodes include additional kubelet configuration, applied during their deployment, to achieve best performance (see the kubelet configuration sketch after this note). It:

      • Grants containers in Guaranteed pods with integer CPU requests exclusive access to CPUs on the node.

      • Reserves some cores for the system using the reservedSystemCPUs option (the kubelet requires a CPU reservation greater than zero when the static policy is enabled), making sure they belong to NUMA node 0 (the NIC in this example is wired to NUMA node 1; use cores from NUMA node 1 instead if your NIC is wired to NUMA node 0).

      • Sets the topology manager policy to single-numa-node, so that a pod is admitted only if all of its requested CPUs and devices can be allocated from exactly one NUMA node.

    • The kube_node group is commented out with # so that only the control plane nodes are deployed at first (the worker nodes will be added later, after the components required by the DPF system are installed).
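
    For reference, the following is a minimal sketch of what those per-node variables are intended to render into the workers' kubelet configuration. The file path and exact rendering are assumptions about the modified Kubespray roles, not something this guide asks you to edit by hand:

    /etc/kubernetes/kubelet-config.yaml (sketch)

    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    # Pin containers in Guaranteed pods with integer CPU requests to exclusive CPUs
    cpuManagerPolicy: static
    # Admit a pod only if its requested CPUs and devices fit in one NUMA node
    topologyManagerPolicy: single-numa-node
    # Cores reserved for system daemons and the kubelet (NUMA node 0 here,
    # since the NIC in this example is wired to NUMA node 1)
    reservedSystemCPUs: 0-7

    To confirm which cores belong to which NUMA node on a worker, check the NUMA lines in lscpu output (lscpu | grep NUMA).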

    inventory/mycluster/hosts.yaml

    all:
      hosts:
        master1:
          ansible_host: 10.0.110.1
          ip: 10.0.110.1
          access_ip: 10.0.110.1
          node_labels:
            "k8s.ovn.org/zone-name": "master1"
        master2:
          ansible_host: 10.0.110.2
          ip: 10.0.110.2
          access_ip: 10.0.110.2
          node_labels:
            "k8s.ovn.org/zone-name": "master2"
        master3:
          ansible_host: 10.0.110.3
          ip: 10.0.110.3
          access_ip: 10.0.110.3
          node_labels:
            "k8s.ovn.org/zone-name": "master3"
        worker1:
          ansible_host: 10.0.110.21
          ip: 10.0.110.21
          access_ip: 10.0.110.21
          node_labels:
            "node-role.kubernetes.io/worker": ""
            "k8s.ovn.org/dpu-host": ""
            "k8s.ovn.org/zone-name": "worker1"
          node_annotations:
            "k8s.ovn.org/remote-zone-migrated": "worker1"
          kubelet_cpu_manager_policy: static
          kubelet_topology_manager_policy: single-numa-node
          kubelet_reservedSystemCPUs: 0-7
        worker2:
          ansible_host: 10.0.110.22
          ip: 10.0.110.22
          access_ip: 10.0.110.22
          node_labels:
            "node-role.kubernetes.io/worker": ""
            "k8s.ovn.org/dpu-host": ""
            "k8s.ovn.org/zone-name": "worker2"
          node_annotations:
            "k8s.ovn.org/remote-zone-migrated": "worker2"
          kubelet_cpu_manager_policy: static
          kubelet_topology_manager_policy: single-numa-node
          kubelet_reservedSystemCPUs: 0-7
      children:
        kube_control_plane:
          hosts:
            master1:
            master2:
            master3:
        kube_node:
          hosts:
            worker1:
            worker2:
        etcd:
          hosts:
            master1:
            master2:
            master3:
        k8s_cluster:
          children:
            kube_control_plane:
            # kube_node:
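
With the inventory in place (and the kube_node group still commented out), the control-plane-only cluster is typically brought up with Kubespray's standard cluster.yml playbook. The invocation below is a sketch; the -u depuser and -b (become root) flags are assumptions based on the non-root deployment described above:

    Jump Node Console

    (.venv) depuser@jump:~/kubespray$ ansible-playbook -i inventory/mycluster/hosts.yaml -u depuser -b cluster.yml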
